 Balearic Islands


Grandes modelos de lenguaje: de la predicción de palabras a la comprensión?

arXiv.org Artificial Intelligence

Large language models, such as the well-known ChatGPT, have brought about an unexpected revolution in the field of artificial intelligence. On the one hand, they have numerous practical applications and enormous potential still to be explored. On the other hand, they are also the subject of debate from scientific, philosophical, and social perspectives: there are doubts about the exact mechanisms of their functioning and their actual capacity for language comprehension, and their applications raise ethical dilemmas. In this chapter, we describe how this technology has been developed and the fundamentals of its operation, allowing us to better understand its capabilities and limitations and to introduce some of the main debates surrounding its development and use.


Integrating Inverse and Forward Modeling for Sparse Temporal Data from Sensor Networks

arXiv.org Artificial Intelligence

We present CavePerception, a framework for the analysis of sparse data from sensor networks that incorporates elements of inverse modeling and forward modeling. By integrating machine learning with physical modeling in a hypothesis space, we aim to improve the interpretability of sparse, noisy, and potentially incomplete sensor data. The framework assumes data from a two-dimensional sensor network, laid out in a graph structure, that detects certain objects with certain motion patterns. Examples of such sensors are magnetometers. Given knowledge about the objects and the way they act on the sensors, one can develop a data generator that produces data from simulated motions of the objects across the sensor field. The framework uses the simulated data to infer object behaviors across the sensor network. The approach is experimentally tested on real-world data, where magnetometers are used at an airport to detect and identify aircraft motions. Experiments demonstrate the value of integrating inverse and forward modeling, enabling intelligent systems to better understand and predict complex, sensor-driven events.


Kissing to Find a Match: Efficient Low-Rank Permutation Representation - Supplementary Material Zorah Lähner University of Siegen

Neural Information Processing Systems

Onofre Martorell, Princeton University, University of the Balearic Islands, Princeton, NJ 08544, United States, ForInDoc researcher of the Government of the Balearic Islands, yuval.bahat@gmail.com

Our supplementary material includes a figure demonstrating the ability of our method to handle large matching problems and a graph showing the influence of the permutation matrix sparsity on the computation speed, as well as accuracy values for the experiments on point cloud assignment. As our method requires devising problem-specific adaptations, the supplementary material also includes a discussion of potential adaptations to our method. Further, it gives a short note on the non-linearity (ReLU) in our approach. Following our shape-matching experiments described in Sec.


Geospatial distributions reflect rates of evolution of features of language

arXiv.org Artificial Intelligence

Quantifying the speed of linguistic change is challenging because the historical evolution of languages is sparsely documented. Consequently, traditional methods rely on phylogenetic reconstruction. In this paper, we propose a model-based approach to the problem through the analysis of language change as a stochastic process combining vertical descent, spatial interactions, and mutations in both dimensions. A notion of linguistic temperature emerges naturally from this analysis as a dimensionless measure of the propensity of a linguistic feature to undergo change. We demonstrate how temperatures of linguistic features can be inferred from their present-day geospatial distributions, without recourse to information about their phylogenies. Thus, the evolutionary dynamics of language, operating across thousands of years, leave a measurable geospatial signature. This signature licenses inferences about the historical evolution of languages even in the absence of longitudinal data.


Towards certification: A complete statistical validation pipeline for supervised learning in industry

arXiv.org Artificial Intelligence

The field of Machine Learning (ML) [1, 2] and its broad spectrum of applications have revolutionized many technological industries in recent years, ranging from the energy sector and materials science to telecommunications, finance, and consumer goods, to name a few [3]. In aeronautical engineering and aerospace technologies, ML tools have been embraced only in recent years, and their impact is growing at a rapid pace, ranging from general-purpose ML-based fluid mechanics [4-6], aeroacoustics [7], wind turbines [8], and aerostructures [9] (including prediction of landing gear loads [10]) to flight trajectory optimization [11] and enhanced predictive maintenance [12, 13]; see the recent and illuminating reviews [14, 15] and references therein. Interestingly, the integration of ML-related tools and ideas in the aeronautical and aerospace industries is still in its infancy. Part of the reason is that any new technology has a necessary adoption curve [16, 17], and the fact that ML solutions require expert knowledge at the crossroads of computer science and statistics, as well as a sophisticated operationalization infrastructure (MLOps) [18], does not facilitate this adoption. However, a deeper reason is probably impeding faster adoption: while ML technologies promise high performance and reductions in development and operating costs [19] (e.g., by reducing costs related to expensive and lengthy wind tunnel experiments and numerical simulations), ensuring adequate safety remains paramount in the aeronautical industry, and ML-based tools are often seen as sophisticated black boxes that suffer from a low degree of trustworthiness, which makes their safety difficult to validate. Therefore, air safety authorities demand rigorous validation and verification processes for these models, and industry leaders have started to propose guidelines and a roadmap on concepts of design assurance for neural network-related technologies [20-22].
However, only very recently has industry started to embrace the complexities of certifying ML models [23-27], prompting discussions around the development of guidelines and a roadmap for design assurance, especially concerning neural network-related technologies. This pressing need calls for collaborative efforts within the industry to establish robust validation frameworks that not only meet regulatory standards but also address the evolving challenges posed by ML integration. This has been well understood and acted upon by Airbus, which has established an internal working group on the verification and validation of surrogate models in the loads and stress domains.


The potential of LLM-generated reports in DevSecOps

arXiv.org Artificial Intelligence

Alert fatigue is a common issue faced by software teams using the DevSecOps paradigm. The overwhelming number of warnings and alerts generated by security and code scanning tools, particularly in smaller teams where resources are limited, leads to desensitization and diminished responsiveness to security warnings, potentially exposing systems to vulnerabilities. This paper explores the potential of LLMs in generating actionable security reports that emphasize the financial impact and consequences of detected security issues, such as credential leaks, if they remain unaddressed. A survey conducted among developers indicates that LLM-generated reports significantly enhance the likelihood of immediate action on security issues by providing clear, comprehensive, and motivating insights. Integrating these reports into DevSecOps workflows can mitigate attention saturation and alert fatigue, ensuring that critical security warnings are addressed effectively.


Multi-scale Conditional Generative Modeling for Microscopic Image Restoration

arXiv.org Artificial Intelligence

The advance of diffusion-based generative models in recent years has revolutionized state-of-the-art (SOTA) techniques in a wide variety of image analysis and synthesis tasks, whereas their adaptation to image restoration, particularly within computational microscopy, remains theoretically and empirically underexplored. In this research, we introduce a multi-scale generative model that enhances conditional image restoration through a novel exploitation of the Brownian Bridge process within the wavelet domain. By initiating the Brownian Bridge diffusion process specifically at the lowest-frequency subband and applying generative adversarial networks at the subsequent multi-scale high-frequency subbands in the wavelet domain, our method provides significant acceleration during training and sampling while sustaining high image generation quality and diversity on par with SOTA diffusion models. Experimental results on various computational microscopy and imaging tasks confirm our method's robust performance and its considerable reduction in sampling steps and time. This technique offers an image restoration framework that balances efficiency with quality, marking a major stride in incorporating cutting-edge generative models into computational microscopy workflows.
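As a rough illustration of the Brownian Bridge process named in this abstract, the sketch below samples the marginal of a standard Brownian bridge pinned to fixed values at both ends. It is a textbook construction, not the paper's wavelet-domain model: the function names, the `sigma` parameter, and the choice of which endpoint stands for the degraded or the restored signal are all assumptions made for the example.

```python
import random


def bridge_mean(start, end, t, horizon):
    """Mean of the bridge at time t: linear interpolation between endpoints."""
    return start + (t / horizon) * (end - start)


def bridge_variance(t, horizon, sigma=1.0):
    """Variance of the bridge at time t; zero at t = 0 and t = horizon."""
    return sigma ** 2 * t * (horizon - t) / horizon


def sample_bridge(start, end, t, horizon, sigma=1.0, rng=random):
    """Sample each coordinate of a bridge between two equal-length signals.

    `start` and `end` are lists of floats (hypothetical stand-ins for the
    two images being connected); coordinates are sampled independently.
    """
    std = bridge_variance(t, horizon, sigma) ** 0.5
    return [
        bridge_mean(a, b, t, horizon) + std * rng.gauss(0.0, 1.0)
        for a, b in zip(start, end)
    ]


# Toy signals, invented for illustration only.
signal_a = [0.0, 1.0]
signal_b = [2.0, 3.0]
midpoint_sample = sample_bridge(signal_a, signal_b, t=5, horizon=10)
```

Because the variance vanishes at both ends, every sample path starts exactly at one signal and ends exactly at the other, which is what distinguishes a bridge from an unconditioned diffusion toward noise.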


Swap distance minimization beyond entropy minimization in word order variation

arXiv.org Artificial Intelligence

Here we consider the problem of all the possible orders of a linguistic structure formed by $n$ elements, for instance, subject, direct object and verb ($n=3$) or subject, direct object, indirect object and verb ($n=4$). We investigate if the frequency of the $n!$ possible orders is constrained by two principles. First, entropy minimization, a principle that has been suggested to shape natural communication systems at distinct levels of organization. Second, swap distance minimization, namely a preference for word orders that require fewer swaps of adjacent elements to be produced from a source order. Here we present average swap distance, a novel score for research on swap distance minimization, and investigate the theoretical distribution of that score for any $n$: its minimum and maximum values and its expected value in die rolling experiments or when the word order frequencies are shuffled. We investigate whether entropy and average swap distance are significantly small in distinct linguistic structures with $n=3$ or $n=4$ in agreement with the corresponding minimization principles. We find strong evidence of entropy minimization and swap distance minimization with respect to a die rolling experiment. The evidence of these two forces with respect to a Polya urn process is strong for $n=4$ but weaker for $n=3$. We still find evidence of swap distance minimization when word order frequencies are shuffled, indicating that swap distance minimization effects are beyond pressure to minimize word order entropy.
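To make the two quantities in this abstract concrete, the following sketch computes the swap distance between two orders (the minimum number of adjacent swaps, which equals the number of inverted pairs), the Shannon entropy of the order frequencies, and a frequency-weighted mean swap distance. The weighting, the choice of the most frequent order as the source order, and the counts themselves are assumptions made for illustration; the paper's exact definition of average swap distance may differ.

```python
from itertools import permutations
from math import log2


def swap_distance(order, source):
    """Minimum number of adjacent swaps needed to turn `source` into `order`.

    This equals the number of element pairs whose relative order differs
    between the two sequences.
    """
    position = {element: i for i, element in enumerate(source)}
    ranks = [position[element] for element in order]
    return sum(
        1
        for i in range(len(ranks))
        for j in range(i + 1, len(ranks))
        if ranks[i] > ranks[j]
    )


def entropy(probabilities):
    """Shannon entropy in bits, skipping zero-probability orders."""
    return -sum(p * log2(p) for p in probabilities if p > 0)


def average_swap_distance(frequencies, source):
    """Frequency-weighted mean swap distance to `source` (assumed definition)."""
    total = sum(frequencies.values())
    weighted = sum(
        count * swap_distance(order, source)
        for order, count in frequencies.items()
    )
    return weighted / total


# Hypothetical counts for the n = 3 case (S, O, V); not data from the paper.
counts = {
    ("S", "O", "V"): 40,
    ("S", "V", "O"): 35,
    ("V", "S", "O"): 10,
    ("O", "S", "V"): 8,
    ("V", "O", "S"): 5,
    ("O", "V", "S"): 2,
}
source_order = max(counts, key=counts.get)  # assumption: most frequent order
mean_distance = average_swap_distance(counts, source_order)
order_entropy = entropy([c / sum(counts.values()) for c in counts.values()])
# The largest possible swap distance is reached by the fully reversed order.
max_distance = max(swap_distance(p, source_order) for p in permutations(source_order))
```

With these toy counts, most of the probability mass sits on orders one swap away from the source order, so the mean distance is well below the maximum of $n(n-1)/2 = 3$ for $n=3$.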


Adaptive control of recurrent neural networks using conceptors

arXiv.org Artificial Intelligence

Recurrent Neural Networks excel at predicting and generating complex high-dimensional temporal patterns. Due to their inherent nonlinear dynamics and memory, they can learn unbounded temporal dependencies from data. In a Machine Learning setting, the network's parameters are adapted during a training phase to match the requirements of a given task, increasing its computational capabilities. After training, the network parameters are kept fixed to exploit the learned computations. The static parameters thereby render the network unable to adapt to changing conditions, such as external or internal perturbations. In this manuscript, we demonstrate how keeping parts of the network adaptive even after training enhances its functionality and robustness. Here, we utilize the conceptor framework and devise an adaptive control loop that continuously analyzes the network's behavior and adjusts its time-varying internal representation to follow a desired target. We demonstrate how the added adaptivity of the network supports the computational functionality in three distinct tasks: interpolation of temporal patterns, stabilization against partial network degradation, and robustness against input distortion. Our results highlight the potential of adaptive networks in machine learning beyond training, enabling them to not only learn complex patterns but also dynamically adjust to changing environments, ultimately broadening their applicability.


Computational lexical analysis of Flamenco genres

arXiv.org Artificial Intelligence

Flamenco, recognized by UNESCO as part of the Intangible Cultural Heritage of Humanity, is a profound expression of cultural identity rooted in Andalusia, Spain. However, there is a lack of quantitative studies that help identify characteristic patterns in this long-lived music tradition. In this work, we present a computational analysis of Flamenco lyrics, employing natural language processing and machine learning to categorize over 2000 lyrics into their respective Flamenco genres, termed $\textit{palos}$. Using a Multinomial Naive Bayes classifier, we find that lexical variation across styles enables accurate identification of distinct $\textit{palos}$. More importantly, using an automatic method based on word usage, we obtain the semantic fields that characterize each style. Further, applying a metric that quantifies the inter-genre distance, we perform a network analysis that sheds light on the relationship between Flamenco styles. Remarkably, our results suggest historical connections and $\textit{palo}$ evolutions. Overall, our work illuminates the intricate relationships and cultural significance embedded within Flamenco lyrics, complementing previous qualitative discussions with quantitative analyses and sparking new discussions on the origin and development of traditional music genres.
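For readers unfamiliar with the classifier named in this abstract, here is a minimal from-scratch sketch of a bag-of-words Multinomial Naive Bayes model with add-one smoothing. It is not the authors' pipeline: the class name, the whitespace tokenization, the labels `palo_a` and `palo_b`, and the four toy lyrics are all invented for illustration.

```python
from collections import Counter, defaultdict
from math import log


class MultinomialNaiveBayes:
    """Bag-of-words multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, documents, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter(labels)      # label -> number of documents
        for text, label in zip(documents, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocabulary = {
            word for counts in self.word_counts.values() for word in counts
        }
        return self

    def log_scores(self, text):
        """Unnormalized log posterior of each label for one document."""
        total_docs = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, doc_count in self.label_counts.items():
            counts = self.word_counts[label]
            denominator = sum(counts.values()) + vocab_size
            score = log(doc_count / total_docs)  # log prior
            for word in text.lower().split():
                if word in self.vocabulary:  # words unseen in training are ignored
                    score += log((counts[word] + 1) / denominator)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy training set with placeholder genre labels; not data from the paper.
train_texts = [
    "pena llanto muerte madre",
    "llanto pena soledad noche",
    "mar barco sal alegria",
    "sal mar cadiz barco",
]
train_labels = ["palo_a", "palo_a", "palo_b", "palo_b"]
model = MultinomialNaiveBayes().fit(train_texts, train_labels)
predicted = model.predict("pena y llanto")
```

The per-label word probabilities learned by such a model are also what make it interpretable: words with a high probability under one label and a low probability under the others are natural candidates for that label's characteristic vocabulary.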